Efficient Classification from Multiple Heterogeneous Databases

Authors

  • Xiaoxin Yin
  • Jiawei Han
Abstract

With the rapid expansion of computer networks, data mining on heterogeneous databases has become unavoidable. In this paper we propose MDBM, an accurate and efficient approach for classification on multiple heterogeneous databases. Inter-database links serve as bridges for information transfer, but because they are detected automatically, they may or may not be useful or even valid; we therefore propose a regression-based method for predicting their usefulness. Because inter-database communication is costly, MDBM employs a new strategy for cross-database classification that finds and performs actions with high benefit-to-cost ratios. Experiments show that MDBM achieves high accuracy in cross-database classification, with much higher efficiency than previous approaches.
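The abstract does not say how actions are scored or selected, so the following is only a generic sketch of choosing actions by benefit-to-cost ratio, not MDBM's actual procedure. The `Action` class, the `pick_actions` function, the budget, and all numbers are hypothetical and serve purely as illustration.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    benefit: float  # hypothetical estimated gain from performing the action
    cost: float     # hypothetical communication cost, assumed > 0


def pick_actions(actions, budget):
    """Greedily pick actions in descending benefit-to-cost order within a budget."""
    chosen = []
    remaining = budget
    for action in sorted(actions, key=lambda a: a.benefit / a.cost, reverse=True):
        if action.cost <= remaining:
            chosen.append(action)
            remaining -= action.cost
    return chosen


# Made-up candidates; the ratios are 0.03, 0.05 and 0.01 respectively.
candidates = [
    Action("follow_link_A", benefit=0.30, cost=10.0),
    Action("follow_link_B", benefit=0.05, cost=1.0),
    Action("follow_link_C", benefit=0.20, cost=20.0),
]
selected = pick_actions(candidates, budget=12.0)
print([a.name for a in selected])
```

A greedy ratio ordering is used here only because it is the simplest way to prefer cheap, informative actions; the paper itself may estimate benefit and cost quite differently.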

Similar resources

HeteroClass: A Framework for Effective Classification from Heterogeneous Databases

Classification is an important data mining task that has been studied from different perspectives. Recently, multi-relational classification algorithms have been studied because of their many real-world applications. However, current work has generally assumed that all the data needed to build an accurate prediction model resides in a single database. Many practical settings, however, require that we com...

Applying a Data Miner To Heterogeneous Schema Integration

An application of data mining techniques to heterogeneous database schema integration is introduced. We use attribute-oriented induction to mine for characteristic and classification rules about individual attributes from heterogeneous databases. Each mining request is conditioned on a subset of attributes identified as "common" between the multiple databases. We develop a method to compare the...

Comparative Study on Text Pattern Matching for Heterogeneous System

Shikha Pandey, Asst. Professor (CSE), Rungta College of Engineering & Technology, Bhilai, Chhattisgarh, India. Abstract: Pattern matching has been routinely used in various computer applications, for example, in editors, in the retrieval of textual, image, or sound information, and in searching nucleotide or amino acid sequence patterns in genome and protein sequence databases...

Rough Set Theory and Fuzzy Logic Based Warehousing of Heterogeneous Clinical Databases

Medical databases hold large amounts of data about patients and their medical conditions. Analyzing all these databases is one of the difficult tasks in the medical environment. To warehouse these databases and analyze a patient's condition, we need an efficient data mining technique. In this paper, an efficient data mining technique for warehousing clinic...

A Heterogeneous Naive-Bayesian Classifier for Relational Databases

Geetha Manjunath, M Narasimha Murty, Dinkar Sitaram. HP Laboratories, HPL-2009-225. Keywords: Relational databases, Classification, Data Mining, RDF. Most enterprise data is distributed in multiple relational databases with expert-designed schema. Application of single-table data mining techniques to distributed relational data not only inc...

Publication date: 2005